We investigated the dimensionality of various indicators of reading prosody, and the relations of word reading 
and listening comprehension to the identified dimension(s) of reading prosody, using longitudinal data from 
Grades 1 to 3. A total of 371 English-speaking children were assessed on oral text reading, word reading, and 
listening comprehension in the fall and spring of each year (i.e., 6 waves of data). From oral text reading, 
reading prosody was evaluated on pause structures (pause duration, pause frequency) and pitch (intonation 
contour, F0 change) using spectrographic analysis, and on expressiveness, smoothness, phrasing, and pacing 
using the Multi-Dimensional Fluency Scale (MFS). A bifactor structure described the data best across the 6 
waves, composed of (a) a ratings and pause general factor, which captured common variance among MFS, 
pause frequency, and pause duration; (b) ratings (MFS) and pause specific factors, which captured variance 
over and above the ratings and pause general factor; and (c) a separate pitch factor, which captured variance 
in intonation contour and F0 change. Word reading and listening comprehension were related to the identified 
dimensions of reading prosody, but when they were in a model together, word reading, not listening 
comprehension, was uniquely related to reading prosody across the six waves. These results indicate that 
reading prosody is multidimensional and that a pitch factor is a dissociable skill from the general ratings and 
pause prosody. Furthermore, word reading is the primary driver for the development of various dimensions 
of reading prosody, at least for children in primary grades. 


Educational Impact and Implications Statement 

Reading prosody, reading texts with appropriate expression, has been widely considered an important 
feature of text reading fluency. We found that multiple aspects and indicators of reading prosody are best 
described as a multidimensional construct composed of a pause and ratings dimension and a pitch 
dimension. Children’s word reading and listening comprehension skills were both related to these 
dimensions of reading prosody, but word reading had a consistent and independent relation. These results 
indicate the importance of word reading development for expressive reading for children in primary 
grades. 


Keywords: listening comprehension, Multi-Dimensional Fluency Scale, reading prosody, spectrograph, 
word reading 

Editor’s Note. Kathy S. Binder served as the action editor for this article. (SG)

Young-Suk Grace Kim, School of Education, University of California,
Irvine; Jamie M. Quinn, Florida Center for Reading Research,
Florida State University; Yaacov Petscher, College of Social
Work, Florida State University.

We thank participating schools, teachers, and children. This research
was supported by Grants R305A120147 and R305A180055 by the
Institute of Education Sciences, US Department of Education. The
content is solely the responsibility of the authors and does not
necessarily represent the official views of the funding agency.

Correspondence concerning this article should be addressed to
Young-Suk Grace Kim, School of Education, University of California, Irvine,
3455 Education Building, Irvine, CA 92697. E-mail: Youngsk7@uci.edu
or young.kim@uci.edu

Fluent reading, widely known as text or oral reading fluency,
has garnered substantial attention in research and educational
settings. Definitions of text reading fluency vary (Kuhn,
Schwanenflugel, Meisinger, Levy, & Rasinski, 2010; National
Institute of Child Health and Human Development, 2000; Wolf &
Katzir-Cohen, 2001), but they typically include the following three
aspects: text reading accuracy, text reading speed, and expression
(i.e., reading prosody). Clear in this is a recognition of reading 
prosody as one of the defining features of fluent reading (Dow- 
hower, 1991; Kuhn & Stahl, 2003; Schreiber, 1991; Wolf & 
Katzir-Cohen, 2001). In fact, reading prosody is hypothesized to 
be “at the heart of the development of reading skill” (Kuhn et al., 
2010, p. 239) and is related to reading comprehension (e.g., Ar- 
cand et al., 2014; Binder et al., 2013; Calet, Gutierrez-Palma, & 
Defior, 2015; Groen, Veenendaal, & Verhoeven, 2019; Klauda & 
Guthrie, 2008; Miller & Schwanenflugel, 2006, 2008; Schwanen- 
flugel, Hamilton, Wisenbaker, Kuhn, & Stahl, 2004; Veenendaal, 
Groen, & Verhoeven, 2014). However, most previous studies on 
text reading fluency have focused on the accuracy and speed 
aspects of connected text reading (text reading efficiency to be 
precise; Baker et al., 2008; Baker, Park, & Baker, 2012; Daane, 
Campbell, Grigg, Goodman, & Oranje, 2005; Fuchs, Fuchs, Hosp, 
& Jenkins, 2001; Jenkins, Fuchs, van den Broek, Espin, & Deno, 
2003; Kim, 2015; Kim & Wagner, 2015; Kim, Park, & Wagner, 
2014; Kim, Petscher, Schatschneider, & Foorman, 2010; Kim, 
Wagner, & Foster, 2011; Kim, Wagner, & Lopez, 2012; Riedel, 
2007; Roehrig, Petscher, Nettles, Hudson, & Torgesen, 2008; 
Silverman, Speece, Harring, & Ritchey, 2013; Tilstra, McMaster, 
van den Broek, Kendeou, & Rapp, 2009) with considerably less 
empirical attention to reading prosody. 

To address this gap in the literature, our goals in the present 
study were to investigate (a) the dimensionality of various indica- 
tors of reading prosody and (b) the relations of word reading and 
oral language comprehension (i.e., listening comprehension) to the 
identified dimension(s) of reading prosody, using longitudinal data 
from children in primary grades in elementary school (from Grade 
1 to Grade 3). Note that in the present study, we focus on prosody 
in reading connected texts that is part of the definition of text 
reading fluency, not on prosodic sensitivity in isolated words, 
known as word prosody or prosodic sensitivity (see Calet, 
Gutierrez-Palma, Simpson, Gonzalez-Trujillo, & Defior, 2015; 
Kim & Petscher, 2016; Schwanenflugel & Benjamin, 2017; Whal- 
ley & Hansen, 2006; Wood, Wade-Woolley, & Holliman, 2009). 
Also note that we use and differentiate the terms text reading 
fluency and text reading efficiency. Although the definition of text 
reading fluency includes efficiency (accuracy and speed) and 
prosody of reading connected texts, most studies have operation- 
alized text reading fluency as text reading efficiency, excluding 
reading prosody. In a similar vein, text reading fluency is used over 
the widely used broad term, reading fluency, because theoretically 
and empirically text reading fluency is a differentiated construct 
from word reading fluency (Jenkins et al., 2003; Kim, 2015; Kim 
& Wagner, 2015). 


Reading Prosody 


Reading prosody refers to prosodic rendering of the written text 
when reading aloud. Prosody concerns suprasegmental rhythmic 
and melodic features of speech, including pitch (intonation), stress 
(loudness), and duration (Dowhower, 1991). Pitch changes at the 
end of a sentence—typically declining pitch at the end of a 
declarative sentence, and rising pitch at the end of a yes-no 
question (e.g., Schwanenflugel et al., 2004). Pauses are expected in 
meaningful semantic units (e.g., phrasal unit) as well as between 
sentences. Intrasentential pauses are usually shorter than intersentential
pauses (Cooper & Paccia-Cooper, 1980; Schwanenflugel et 
al., 2004). Individuals with skilled reading prosody read texts with 
appropriate raising and lowering of pitch, phrasing or grouping of 
words into meaningful units, lengthening of certain vowels, and 
duration of pauses (Binder et al., 2013; Dowhower, 1991; Groen et 
al., 2019; Miller & Schwanenflugel, 2006; Schwanenflugel et al., 
2004). 

Clear in this brief review is that there are multiple aspects of 
reading prosody (e.g., pitch, pause structure, phrasing, stress). 
These different aspects of reading prosody have been measured 
using rating scales and spectrographic analyses. In a rating scale, 
students’ oral reading is evaluated against a priori established 
criteria on overall expressiveness or on multiple aspects (e.g., 
phrasing). For example, the National Assessment of Educational 
Progress (NAEP) Oral Reading Fluency Scale evaluates students’ 
overall reading expressiveness primarily based on one’s ability to 
read in meaningful phrase units, and this scale was shown to be 
both reliable and related to students’ reading skills (Pinnell et al., 
1995; Sabatini, Wang, & O’Reilly, 2019). Other rating scales 
evaluate multiple aspects of prosody. In the Comprehensive Oral 
Reading Fluency Scale (Benjamin et al., 2013), students’ oral 
reading is evaluated on rate and accuracy (on a scale of 1 to 8), 
expressive intonation (on a scale of 1 to 4), and natural pausing (on 
a scale of 1 to 4). The most widely used rating scale to date is the 
Multi-Dimensional Fluency Scale (Rasinski, 2004; Rasinski, 
Rikli, & Johnston, 2009; Zutell & Rasinski, 1991). The Multi- 
Dimensional Fluency Scale evaluates reading prosody on a scale 
of 1 to 4 in the following four aspects: (a) expression and volume, 
(b) phrasing, (c) smoothness, and (d) pace. The expression and 
volume aspect focuses on sounding or reading like natural lan- 
guage, phrasing evaluates choppiness and intonation, smoothness 
evaluates pauses and smooth rhythm, and pace evaluates conver- 
sational pace. The Multi-Dimensional Fluency Scale has been used 
widely in previous studies and has been shown to be reliable and 
valid in several languages, including English (Paige, Rasinski, & 
Magpuri-Lavell, 2012), Spanish (González-Trujillo, Calet, Defior, 
& Gutiérrez-Palma, 2014), Dutch (Veenendaal, Groen, & Verho- 
even, 2014, 2015), and Turkish (Yildiz et al., 2014). 

Reading prosody has been also examined using spectrographic 
analysis. Spectrographic analysis allows researchers to measure 
specific aspects of reading prosody with precision. For example, 
pitch can be measured by marking the difference between fundamental
frequency (F0) in the last peak and F0 at the end of the 
sentence in hertz (Hz). Duration of intersentential and intrasenten- 
tial pauses can be measured in milliseconds, and intensity (or 
loudness as a measure of stress) can be measured in decibels (dB). 
A series of studies on reading prosody using spectrographic mea- 
surements have been conducted by Schwanenflugel and her col- 
leagues. For example, they measured reading prosody in terms of 
duration (intersentential pause duration, intrasentential pause
duration) and pitch (sentence-final F0 change, and child-adult F0
match), and examined the relations of reading prosody to reading 
skills (Schwanenflugel et al., 2004). Other studies revealed that 
more advanced readers in second grade have shorter pauses and 
fewer ungrammatical pauses (which captures the phrasing aspect 
of prosody in studies using a rating scale; Valle, Binder, Walsh, 
Nemier, & Bangs, 2013); that children marked direct quotes, contrastive
words, and exclamations with a higher pitch than in
an unmarked context (Schwanenflugel, Westmoreland, & Benjamin,
2015); that reading prosody values vary depending on text 
complexity (Benjamin & Schwanenflugel, 2010); and that various 
indicators such as inappropriate (or ungrammatical) pauses, adult-like
intonation contour (i.e., vocalic nuclei), and pitch (F0) change 
are related to reading comprehension (Benjamin & Schwanenflu- 
gel, 2010; Miller & Schwanenflugel, 2006, 2008; Schwanenflugel 
et al., 2004, 2015). Álvarez-Cañizo, Suárez-Coalla, and Cuetos 
(2015) also examined similar indicators using spectrographic mea- 
surements in Spanish and found that children with poor reading 
comprehension skills had poor reading prosody. 

In the present study, we expand our understanding of reading 
prosody by examining the dimensionality of multiple, widely used 
reading prosody indicators (e.g., intonation, pitch, pauses, phras- 
ing, smoothness, pace), using both spectrographic analysis and a 
rating scale. Although these multiple aspects of reading prosody 
have been examined in previous studies, to our knowledge, few 
studies have explicitly investigated the dimensionality of these 
diverse indicators. One exception is Benjamin et al.’s (2013) study 
in which reading prosody was measured by ungrammatical pauses 
and sentence-final pitch indicators using spectrographic analysis, 
and a principal component exploratory factor analysis revealed 
two factors: a pitch factor and a pause factor. Other studies also 
suggest that pitch variables may be capturing a somewhat different 
dimension than pause structure variables because pause structure 
variables have weak relations with pitch variables (Benjamin & 
Schwanenflugel, 2010; Binder et al., 2013; Schwanenflugel et al., 
2004, 2015). Given that various aspects (whether measured by 
rating scale or spectrogram approaches) are purported to capture 
the construct of reading prosody, it is important to examine 
whether multiple indicators of reading prosody are best described 
as having a single dimension, multiple dissociable but related 
dimensions, or a bifactor structure with a general factor that 
captures common variance across all the indicators and with spe- 
cific factors (those over and above the general factor; see the Data 
Analytic Strategy below). In the present study, we measured 
multiple aspects of students’ reading prosody using spectrographic 
measurements (intonation contour, sentence-final F0 change,
intersentential pause duration, frequency of ungrammatical pauses, 
and total pause frequency) and the Multi-Dimensional Fluency 
Scale ratings (expression and volume, phrasing, smoothness, and 
pace; Rasinski, Rikli, & Johnston, 2009; Zutell & Rasinski, 1991). 


Predictors of Reading Prosody 


If reading prosody is an important part of text reading fluency, 
then what are the contributing skills to reading prosody? Accord- 
ing to the automaticity theory (LaBerge & Samuels, 1974) and the 
verbal efficiency theory (Perfetti, 1992), one apparent skill that is 
necessary for reading prosody development is decoding or word 
reading skill (e.g., Chall, 1996; Kuhn & Stahl, 2003; Schwanen- 
flugel et al., 2004) because automaticity in word reading or de- 
coding allows cognitive resources (e.g., working memory and 
attention) to be available for additional processes such as semantic 
processing and prosodic reading. Extant evidence indeed supports 
a relation of word reading to reading prosody. For example, word 
reading efficiency (accuracy and rate) was very strongly and 
negatively related to intrasentential pause and moderately related 
to pitch change for third graders (Schwanenflugel et al., 2015; also 
see Benjamin & Schwanenflugel, 2010; Miller & Schwanenflugel,
2008; and Schwanenflugel et al., 2004) and adults with low literacy
skills (Binder et al., 2013). Word reading skill is expected to 
be particularly strongly related to reading prosody in the beginning 
phase of reading development because of its large constraining
role in reading development. For example, reading prosody measured
by the Multi-Dimensional Fluency Scale was strongly related
to nonword reading and text reading efficiency for second
graders (.72 to .78), whereas it was weakly to moderately related for
fourth graders (.24 to .44) learning to read in Spanish (Calet et al., 
2015). 

Another potential skill that might contribute to reading prosody 
development is oral language skill because prosodic reading involves
semantic processing (Kuhn et al., 2010). Prosodic reading (reading
with appropriate raising and lowering of pitch, appropriate
grouping of words in meaningful units, and reading with
adult-like prosodic contour) would be facilitated by semantic 
processing or meaning construction and integration (comprehen- 
sion; Fodor, 1998; Frazier, Carlson, & Clifton, 2006; Webman- 
Shafran, 2018). Therefore, children’s ability in listening compre- 
hension, which involves semantic processing (Adlof, Catts, & 
Little, 2006; Florit & Cain, 2011; Gough & Tunmer, 1986; Kim, 
2016, 2017, 2020), might be related to reading prosody. This 
relation is likely to emerge after children have developed a certain 
level of word reading skill as children’s word reading skill has a 
large constraining role in comprehension processes during the 
beginning phase of reading development (e.g., Adlof et al., 2006; 
Kim & Wagner, 2015). Studies have reported a weak to moderate 
relation of reading prosody to vocabulary (Arcand et al., 2014; 
Groen et al., 2019; Paige, Rasinski, Magpuri-Lavell, & Smith, 
2014; Ravid & Mashraki, 2007; Veenendaal et al., 2015) and to 
syntactic knowledge (Veenendaal et al., 2015). To our knowledge, 
no prior work has investigated the relation of listening compre- 
hension to reading prosody. 


The Present Study 


The goals of the present study were to address several gaps in 
the literature on reading prosody—its dimensionality, its predictors 
(the relations of word reading and listening comprehension to 
reading prosody), and the potentially changing nature of these 
dimensions and predictors with development, using longitudinal 
data from primary grade children in elementary school. The theory 
and evidence reviewed above suggest that word reading plays a 
large constraining role in reading development. Then, it is reason- 
able to speculate that the relations among identified dimensions of 
reading prosody (e.g., pitch and pause structure) and the relations 
of word reading and listening comprehension to reading prosody 
may change with development. The following were two research 
questions that guided the present study: (a) What is the dimen- 
sionality of various reading prosody indicators (intonation contour, 
sentence-final F0 change, intersentential pause duration, frequency 
of ungrammatical pauses, and total pause frequency that are mea- 
sured by spectrographic analysis; and expression and volume, 
phrasing, smoothness, and pace that are measured by the Multi- 
Dimensional Fluency Scale ratings) for children in the lower 
grades of elementary school (i.e., Grades 1 to 3)? Do the relations 
among identified dimensions change with development?; (b) How 
are children’s word reading and listening comprehension skills 
related to the identified dimensions of reading prosody? Do the
relations change from Grade 1 to Grade 3? These questions were
addressed by using longitudinal data from Grade 1 to Grade 3. 
The first research question on dimensionality was addressed by 
fitting a series of alternative models shown in Figure 1 (see Data
Analytic Strategy section below for details). We hypothesized that
the reading prosody variables would either be dissociable but
related factors (Figure 1b) or have a bifactor structure (Figures
1c-1e; see Data Analytic Strategy section below for details). We 
also expected that the relations between various dimensions of 
reading prosody (e.g., pitch and pause structure) would become
stronger with the development of children’s word reading skills;
that is, as the constraining role of word reading decreases, cogni- 
tive resources will be increasingly available for reading prosody, 
which in turn will allow stronger relations between prosody vari- 
ables. The second research question was addressed by including 
word reading and listening comprehension as predictors of the 
dimensions of reading prosody identified in the first research 
question (see Figure 3). We posited that word reading would be 
strongly related to reading prosody across the grades, particularly 
to pause-related prosody (e.g., ungrammatical pause frequency;
Binder et al., 2013; Schwanenflugel et al., 2004, 2015), whereas
listening comprehension may be related to reading prosody after
the very initial phase of reading development (e.g., Grade 3).

Figure 1. Dimensionality models fit to the prosody data: (a) unidimensional, (b) correlated three factor,
(c) bifactor 1, (d) bifactor 2, and (e) correlated trait bifactor. Residual variances for all indicators were
estimated but are not shown for brevity. All pathways were estimated, and factor variances were fixed at 1 for model
identification purposes. Smth = smoothness; Pace = pacing; Phrase = phrasing; Expr = expression and volume;
Pause Freq = pause frequencies; Pause Dur = pause durations; F0 Δ = fundamental frequency change; Int
Cont = intonation contour.


Method 


Participants 


The sample students were 371 English-speaking children who 
participated in a 3-year longitudinal study from Grade 1 to Grade
3 in a Southeastern state of the United States. Grades 1 to 3 are an 
important period when students are rapidly developing their de- 
coding skills and associated reading prosody as well as language 
skills (e.g., listening comprehension). Data were collected in the 
fall and spring of each year, totaling six waves. The average age of 
students was 6.36 years (SD = 0.53), 7.33 years (SD = 0.52), and 
8.34 years (SD = 0.54) in the fall of Grades 1, 2, and 3, respec- 
tively. Fifty-two percent of students in Grade 1 (n = 192), 46% of 
students in Grade 2 (n = 172), and 39% of students in Grade 3 
(n = 146) qualified for free or reduced lunch, a proxy variable for 
poverty status. The racial/ethnic breakdown was as follows in 
Grade 1: 59.8% White, 25.9% Black, 5.9% Hispanic, 2.4% Asian/ 
Pacific Islander, 0.3% American Indian/Alaska Native, and 5.9% 
identified as two or more races/ethnicities. Slightly less than half 
of the sample was female (n = 180; 48.5%). Human subjects 
approval was obtained from Florida State University (HSC No. 
2015.16488). 


Measures 


Reading prosody. Children were presented with three grade- 
level passages in each wave and were asked to read each passage 
aloud. After each passage, a simple literal comprehension question 
(e.g., name of a main character in the story) was asked to ensure 
that children read for meaning. Passages were normed in the state 
where the study was conducted prior to the study, and they were 
composed of 155 to 198 words in Grade 1, 187 to 200 words in 
Grade 2, and 200 to 307 words in Grade 3. One passage in each 
wave was used as a linking passage between waves (e.g., one 
passage between Waves 1 and 2, another passage between Waves 
2 and 3, etc.). Students’ oral reading was digitally recorded (i.e., 
saved as a *.wav file). 

Reading prosody was measured by spectrographic analysis and 
a rating scale. Spectrographic analyses were informed by previous 
studies (Benjamin & Schwanenflugel, 2010; Miller & Schwanen- 
flugel, 2006, 2008; Schwanenflugel et al., 2004) and included the 
following five indicators: (a) vocalic nucleus, (b) sentence-final 
change in F0, (c) intersentential pause duration (in ms), (d) frequency
of ungrammatical pauses, and (e) total pause frequency. 
Vocalic nucleus is a measure of intonation contour in hertz (pattern 
of pitch changes in the voice; Miller & Schwanenflugel, 2008). 
Sentence-final change in F0 is the difference in hertz from the final
pitch peak to final F0. There were three interrogative sentences
(i.e., sentence 1 for passage 3 in Wave 1 [Have you seen a
rainbow?], sentence 2 for passage 3 in Wave 5 [Do you know what
that means?], and sentence 1 for passage 3 in Wave 6 [Are you
ready for a float trip in a canoe?]), which had positive F0 change.
The positive F0 change values from these sentences were multiplied
by -1 so that they followed the same distribution as F0 change values
from declarative sentences, where F0 decreases (i.e., intonation 
goes down). Following previous work, durations longer than 100 
ms between words or phrases were considered pauses and were 
measured by visually marking the spectrograph because 100 ms
is considered the minimum pause length that can 
be reliably measured (Arcand et al., 2014; Miller & Schwanenflu- 
gel, 2006, 2008). Ungrammatical pauses were inappropriate pauses 
that did not fit into major syntactic boundaries (e.g., clause bound- 
aries) or reasonable phrasal boundaries where a pause would be 
expected (see Benjamin & Schwanenflugel, 2010). Praat software 
(Version 5.4; Boersma & Weenink, 2015) was used to measure 
each of these indicators. For the spectrographic analysis, the first 
three sentences in the oral reading of each passage were used 
because of the resource-intensive nature of the coding and the 
large amount of data (see Miller & Schwanenflugel, 2006; 
Schwanenflugel et al., 2004, for a similar approach). One graduate 
student and three undergraduate students in a speech and language 
pathology program underwent rigorous training and coded data 
using Praat. Similarity reliability coefficients (Shrout & Fleiss, 
1979), which indicated the proximity of the coder’s score for each 
variable to that of the primary coder, ranged from .90 to .99, using 
78 cases. 
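
As an illustration only, the two coding rules described above (the 100-ms pause threshold and the sign reversal of F0 change for interrogative sentences) can be sketched in Python. The function names, the word-boundary input format, and the sign convention (final F0 minus final peak) are assumptions made for this sketch; the coding in the study itself was done by trained coders in Praat.

```python
from typing import List, Tuple

# Minimum silence (ms) counted as a pause, as stated in the text.
PAUSE_THRESHOLD_MS = 100.0


def find_pauses(word_bounds_ms: List[Tuple[float, float]]) -> List[float]:
    """Return the durations (ms) of between-word gaps longer than the threshold.

    word_bounds_ms is a hypothetical input: one (onset, offset) pair per word,
    in reading order.
    """
    pauses = []
    for (_, prev_offset), (next_onset, _) in zip(word_bounds_ms, word_bounds_ms[1:]):
        gap = next_onset - prev_offset
        if gap > PAUSE_THRESHOLD_MS:
            pauses.append(gap)
    return pauses


def sentence_final_f0_change(final_peak_hz: float, final_f0_hz: float,
                             interrogative: bool = False) -> float:
    """Change in Hz from the final pitch peak to the sentence-final F0.

    Assumed convention: a falling contour gives a negative value. Values from
    interrogative sentences are multiplied by -1 so that they are on the same
    scale as values from declarative sentences.
    """
    change = final_f0_hz - final_peak_hz
    return -change if interrogative else change
```

Under these assumptions, a declarative sentence falling from 220 Hz to 180 Hz and a question rising from 200 Hz to 240 Hz would both be scored as a change of -40 Hz.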

Students’ reading prosody was also rated by a widely used scale, 
the Multi-Dimensional Fluency Scale (MFS; Rasinski, 2004; 
Rasinski, Homan, & Biggs, 2009; Zutell & Rasinski, 1991). The 
MFS assesses reading prosody in four areas: (a) expression and 
volume, (b) phrasing, (c) smoothness, and (d) pace. Expression 
evaluates the extent to which students’ reading is similar to natural 
language with adequate expression and volume. Phrasing is related 
to marking clause and sentence units. Smoothness is the extent to 
which students easily resolve word and structure difficulties. Pace 
rates conversational pace, whereby a high rating indicates that the 
students read neither too fast nor too slow (see Rasinski, 2004; 
Zutell & Rasinski, 1991, for the scale). Each aspect was rated on 
a scale of 1 to 4 (1 for not fluent reading to 4 for fluent reading). 
Three individuals (one doctoral student in education, one individ- 
ual with a master’s degree in education, and one individual with a 
bachelor’s degree in speech and language pathology) were rigorously
trained, and exact percent agreement ranged from .80 to .90,
using 72 cases. 
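
For readers unfamiliar with the index, exact agreement is the proportion of cases on which two raters assign the identical score. The following is a minimal sketch with a hypothetical function name and made-up ratings; it is not the procedure or data used in the study.

```python
from typing import Sequence


def exact_agreement(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Proportion of cases on which two raters gave the identical score."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Made-up ratings on the 1-to-4 scale for five readings.
print(exact_agreement([1, 2, 3, 4, 4], [1, 2, 3, 3, 4]))
```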

Listening comprehension. Listening comprehension was 
measured by two normed tasks, the Oral Comprehension test of the 
Woodcock-Johnson III Tests of Achievement (henceforth WJOC; 
Woodcock, McGrew, & Mather, 2001) and the Listening Compre- 
hension subscale of the Oral and Written Language Scales, second 
edition (OWLS henceforth; Carrow-Woolfolk, 2011). In WJOC, 
the child heard a sentence or sentences and was asked to supply a
missing word such as a noun, adjective, or verb (e.g., People sit
in _____.; Eduardo is very attentive to the flower seeds that he
plants in his garden every year. He tries to make sure all of the
conditions are right so that everything will _____.). In OWLS, 
an assessor read increasingly difficult words, phrases, and sen- 
tences aloud, and the student responded by indicating which one of 
four pictures best depicted the stimuli (e.g., The sheep eats the 
grass; Although Bill Smith had said, “I won’t take any dogs with 
long ears,” the opposite situation actually occurred.). Cronbach’s 
alpha estimates ranged from .74 to .79 in WJOC and .91 to .93 in 
OWLS. 

Word reading. Word reading was assessed with three tasks: 
the Letter-Word Identification (LWID) task of the Woodcock- 
Johnson Tests of Achievement (Woodcock et al., 2001), the Word 
Reading subtask of the Wechsler Individual Achievement Test, 
third edition (WIAT; Wechsler, 2009), and the Sight Word Effi- 
ciency (SWE) subtask of the Test of Word Reading Efficiency, 
second edition (Torgesen, Wagner, & Rashotte, 2012). In LWID 
and WIAT, the child was asked to read aloud words of increasing 
difficulty. SWE measured a student’s ability to read as many sight 
words of increasing difficulty as accurately as possible within a 
short time limit (45 s). Cronbach’s alpha estimates were .91 to .92 
in LWID and .95 in WIAT at each wave of data. Test-retest 
reliability for SWE was reported to be .93 (Torgesen et al., 2012). 


Procedure 


Students were individually assessed in several sessions of 
30-40 min per session by rigorously trained research assistants in 
quiet spaces in participating schools. 


Data Analytic Strategy 


Confirmatory factor analysis (CFA) and structural equation 
modeling were primary analytic strategies, using Mplus Version 
8.3 (Muthén & Muthén, 1998-2018). Because of the censored 
nature of the pause variables (both duration and frequency were 
censored from below, i.e., had values close to zero), all models 
were fit using maximum likelihood estimation with robust stan- 
dard errors (MLR) in Mplus. 

Research question 1: Dimensionality models of reading 
prosody. The dimensionality of the prosody indicators was de- 
termined using CFA. Performances on pause frequency and un- 
grammatical pause frequency were combined and averaged be- 
cause of extremely high correlations (see the Results section). 
Thus, students’ performance on four prosody indicators from spectrogram
data (intonation contour [i.e., vocalic nuclei], F0 change, 
pause frequency, and pause duration) and four indicators from the 
rating scale (expression and volume, phrasing, pacing, and 
smoothness) were averaged across passages per wave. These eight 
indicators were used in five CFA models that were fit to the data 
per wave. The five models were informed by theory and prior 
evidence. For example, a unidimensional model was tested as the 
baseline model where reading prosody is described as a single 
construct that is measured with various specific indicators or 
aspects (see Figure 1a). Alternatively, previous studies suggested 
that pitch and pause structure may be related but differentiated 
constructs (see Benjamin et al., 2013, and the literature review 
above). However, the factor structure of the four aspects in the 
Multi-Dimensional Fluency Scale, and their relations with other 
prosody indicators are unknown (see Figure 1b). It is also possible 
that various aspects of reading prosody have a bifactor structure 
with a general dimension or factor that captures commonality 
across all prosody indicators and residual specific factors (e.g., 
pitch, pause, or ratings; see Figure 1c). Variations of these models 
are also plausible (see Figures 1d and 1e). Below are detailed 
descriptions of each model. 

The first model was a unidimensional model (see Figure 1a), 
where a reading prosody factor was indicated by the four variables 
from the spectrograph data (pause frequency, pause duration, F0
change, and intonation contour) and the four variables from the 
rating scale (expression and volume, phrasing, pacing, and 
smoothness). The second model was a correlated three-factor 
model (see Figure 1b), where a ratings factor was indicated by 
expression and volume, pacing, phrasing, and smoothness; a sec- 
ond pause factor was indicated by pause frequency and pause 
duration; and a third pitch factor was indicated by intonation 
contour and F0 change. Third, a bifactor model was estimated (see
Figure 1c, bifactor 1), where the factors from the correlated
three-factor model (ratings, pause, and pitch, which are called specific
factors in the bifactor model) and a general factor orthogonal to 
these factors (i.e., the general factor was not allowed to covary 
with the specific factors) were indicated by the eight variables (i.e., 
the four variables for spectrograph and the four variables for the 
rating scale), and the specific factors were not allowed to be 
correlated. In a bifactor model (Gibbons & Hedeker, 1992), the 
general factor (in this case, reading prosody) captures common 
variance across all the manifest variables or indicators, and thus, it 
theoretically captures the most reliable portion of the variance for 
each of the indicators. The specific factors, orthogonal to the 
general factor and to each other, help to explain item response 
variance (item residual variance) that is not captured by the general 
factor (e.g., method variance). Bifactor 2 (see Figure 1d) was 
another bifactor model, which was identical to the bifactor 1
model, but the specific factors (i.e., ratings, pause, and pitch) were 
allowed to correlate. 

Finally, we fit a correlated trait bifactor model (see Figure 1e).
In this model, there was a Prosody: Ratings and Pause general 
factor indicated by the four rating scale indicators (i.e., smooth- 
ness, pacing, phrasing, expression and volume) and two pause 
variables (pause frequency and pause duration). In addition, there 
was a separate Prosody: Pitch factor, indicated by F0 change and
intonation contour. The Prosody: Ratings and Pause general factor 
remained orthogonal to the specific factors of Ratings and Pause, 
but the correlation between Ratings and Pause was allowed to be 
estimated. Prosody: Pitch was allowed to correlate with the general 
factor, Prosody: Ratings and Pause, and with the Ratings and 
Pause specific factors. This alternative model was informed by 
previous evidence and preliminary analysis in the present study. 
Pitch and pause variables measured by spectrographic analysis 
were related but different factors when using exploratory factor 
analysis (Benjamin et al., 2013), and pitch variables have weak 
relations with pause variables (Binder et al., 2013; Schwanenflugel 
et al., 2004). Furthermore, our preliminary analysis revealed that 
the pitch variables, intonation contour and F0 change, had little to
no loadings on the general prosody factor in the Figure 1c and 1d
models in all the waves.
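
As a check on the unidimensional model described above, its degrees of freedom can be counted by hand. The sketch below assumes the usual counting rule (unique variances and covariances minus free parameters) and the identification choice stated in the figure captions, namely factor variances fixed at 1. The function name is illustrative only. The other four models are not counted here because their identification constraints are not fully spelled out in the text.

```python
def cfa_degrees_of_freedom(n_indicators: int, n_free_parameters: int) -> int:
    """Unique variances/covariances among the indicators minus free parameters."""
    n_moments = n_indicators * (n_indicators + 1) // 2
    return n_moments - n_free_parameters


# Unidimensional model: eight indicators load on one factor.
# Free parameters: 8 loadings + 8 residual variances
# (the factor variance is fixed at 1, so it is not estimated).
n_indicators = 8
n_free_parameters = 8 + 8
df_unidimensional = cfa_degrees_of_freedom(n_indicators, n_free_parameters)
print(df_unidimensional)  # 20
```

The result agrees with the 20 degrees of freedom reported for the unidimensional model in Table 3.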

Research question 2: The relations of word reading and 
listening comprehension to reading prosody. After establish- 
ing the dimensionality of the reading prosody variables, the rela- 
tions of word reading and listening comprehension to the reading 
prosody factors were investigated using the structural equation 
model in Figure 3. A latent variable for word reading was created 
from the three normed word reading measures (i.e., LWID, WIAT, 
and SWE), and a latent variable for listening comprehension was 
created from the two normed measures (i.e., WJOC and OWLS). 
Children’s demographic variables were not included in the analy- 
sis because none of them were statistically significant after ac- 
counting for word reading and listening comprehension. 

Multiple criteria were used to determine model fit. The comparative
fit index (CFI) and the Tucker-Lewis index (TLI) were
used, whereby values above .90 are considered adequate and
values above .95 are considered excellent (Hu & Bentler, 1999).
We also used the root mean-squared error of approximation
(RMSEA) and its associated confidence interval, where values
under .08 are preferred (Kline, 2016). For determining the best
fitting nested models, the difference in the Satorra-Bentler
chi-square tests of model fit was used, whereby the preferred
model was one in which the difference in Satorra-Bentler
chi-square estimates was significantly better, or significantly closer
to zero (Muthén & Muthén, 1998-2018). According to the
nested model test available in Mplus 8.3 (Muthén & Muthén,
1998-2018) and using the criteria outlined by Asparouhov and
Muthén (2019), the five tested models were nested such that the
unidimensional model (H0) was nested in the correlated
three-factor model (H1); the correlated three-factor model was nested
in the bifactor Model 1; the bifactor Model 1 was nested in the
bifactor Model 2; and the correlated trait bifactor model was
nested in bifactor Model 2. In certain cases, if the assumption of
nested models was not met (e.g., if the nested model [H0] has
degrees of freedom fewer than or equivalent to those of the
comparison model [H1]; see Results section for examples of
such models), we looked at the differences in the sample-size
adjusted Bayesian information criterion (nBIC) between compared
models. Models with lower nBIC values (i.e., closer to negative
infinity) were preferred, where a difference of 5 was considered
strong evidence for a better fitting model and a difference of 10
was considered very strong evidence for a better fitting model
(Raftery, 1995).
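
The fit cutoffs above can be written as a small lookup. This is a minimal sketch of the stated thresholds only; the function names are illustrative and are not part of Mplus or of the authors' workflow. The example values are those reported for the Wave 3 unidimensional model in Table 3.

```python
def describe_incremental_fit(value: float) -> str:
    """Label a CFI or TLI value using the cutoffs stated in the text."""
    if value > 0.95:
        return "excellent"
    if value > 0.90:
        return "adequate"
    return "below adequate"


def rmsea_is_preferred(value: float) -> bool:
    """RMSEA values under .08 are preferred."""
    return value < 0.08


# Wave 3 unidimensional model (Table 3): CFI = .94, TLI = .91, RMSEA = .13.
wave3_unidimensional = {
    "CFI": describe_incremental_fit(0.94),
    "TLI": describe_incremental_fit(0.91),
    "RMSEA preferred": rmsea_is_preferred(0.13),
}
print(wave3_unidimensional)
```

In this example the incremental indices are adequate while the RMSEA is not in the preferred range, which is why the indices are read together rather than one at a time.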


Results 


Descriptive Statistics 


The descriptive statistics for each reading prosody indicator 
separated by wave are reported in Table 1 (descriptive statistics by 
passage across waves are found in the online supplemental mate- 
rials), and the descriptive statistics for word reading and listening 
comprehension variables by wave are presented in Table 2. Sample 
sizes were n = 350 in Wave 1, n = 366 in Wave 2, n = 326 in 
Wave 3, n = 329 in Wave 4, n = 311 in Wave 5, and n = 300 
in Wave 6. The numbers of students with missing data in each wave
were as follows: n = 21 in Wave 1, n = 5 in Wave 2, n = 45 in 
Wave 3, n = 42 in Wave 4, n = 60 in Wave 5, and n = 71 in Wave 
6. As shown in Table 1, there were differing rates of missingness 
for each of the measured variables. The differing rates of missing- 
ness were mostly due to technical difficulties (e.g., digital recorder 
malfunction). For example, most missing data occurred in estimating
pause durations between sentences and measuring F0 change
across sentences, where these values were more likely to be
missing for passage 2 and passage 3 for Wave 1 and Wave 2 (see
Table S1 in the online supplemental materials for more informa- 
tion). In addition, some students had creaky voice (or glottal fry), 
and they were not included in the reading prosody coding. Based 
on chi-square difference tests, there were no differences on any of 
the demographic variables between students who had missing data 
and those who did not (ps > .1), with one exception in Wave 1: 
Students who qualified for the free and reduced lunch program
were more likely to be missing than students who did not qualify
for free and reduced lunch, χ²(1) = 5.03, p = .025, Cramér’s V =
.12. Complete information on these chi-square tests can be found
in the online supplemental materials (Table S2). Little’s test of
missing data, which tests whether data can be considered missing
completely at random (MCAR), failed to establish MCAR within
the data for Waves 1 to 4 (Wave 1, p < .001; Wave 2, p < .001;
Wave 3, p = .035; Wave 4, p = .007), but the data could be
considered MCAR at Wave 5, χ²(3750) = 3719.23, p = .636, and
at Wave 6, χ²(3224) = 3206.64, p = .583. Missing data in Waves
1 to 4 were mostly attributable to issues with the spectrographic
measurement as mentioned above, so we considered these missing
data to be missing at random (MAR).


Table 1

Descriptive Statistics for the Prosody Variables (Spectrograph and Rating Scale) by Wave, Averaged Across Passages

[Table body not recoverable from the source layout. Columns: n, Min, Max, M, and SD for each of Waves 1 to 6. Rows: F0 change, intonation contour, pause duration, and pause frequency (Avg.) from the spectrograph; smoothness, pacing, phrasing, and expression and volume from the rating scale.]

Note. Total sample sizes per wave were as follows: n = 350 in Wave 1, n = 366 in Wave 2, n = 326 in Wave 3, n = 329 in Wave 4, n = 311 in Wave 5, and n = 300 in Wave 6. F0 = fundamental frequency. Pause duration is reported in seconds.


Table 2

Descriptive Statistics for the Word Reading and Listening Comprehension Variables by Wave

[Table body not recoverable from the source layout. Columns: M, SD, Min, and Max for each of Waves 1 to 6. Rows: WJOC and OWLS (listening comprehension) and LWID, WIAT, and SWE (word reading), each as raw and standard scores.]

Note. Total sample sizes per wave were as follows: n = 350 in Wave 1, n = 366 in Wave 2, n = 326 in Wave 3, n = 329 in Wave 4, n = 311 in Wave 5, and n = 300 in Wave 6. Min = minimum; Max = maximum; Raw = raw score; SS = standard score; WJOC = Woodcock-Johnson III oral comprehension; OWLS = Oral and Written Language Scales; LWID = Letter-Word Identification; WIAT = Wechsler Individual Achievement Test; SWE = Sight Word Efficiency.
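
The Cramér's V reported above for the free and reduced lunch comparison can be recovered from the chi-square statistic. The sketch below assumes a 2 × 2 table (consistent with 1 degree of freedom) and a total of 371 children (the 350 with data plus the 21 with missing data in Wave 1); both are inferences from the text rather than values stated for this specific test.

```python
import math


def cramers_v(chi_square: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramér's V = sqrt(chi-square / (n * min(rows - 1, cols - 1)))."""
    return math.sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))


# chi-square(1) = 5.03 from the text; n = 371 and the 2 x 2 layout are assumptions.
v = cramers_v(5.03, 371, 2, 2)
print(round(v, 2))  # 0.12
```

The rounded value matches the reported V = .12, a small effect.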

Beginning with the spectrogram data, intonation contour remained
similar across the waves, although average intonation contour
was highest in Wave 1 (258.5) and lowest in Wave 6 (245.07).
At all waves, the means for F0 change were negative, indicating
that students, on average, decreased their pitch from the peak to the 
end of the sentence. The frequency of pauses within a sentence and 
the duration of pauses between sentences decreased over time. 
Pause frequencies and durations decreased from an average of 12.8 
pauses within a sentence and 1.37 s between sentences in Wave 1 
to an average of 4.45 pauses within a sentence and 0.58 s between 
sentences in Wave 6. Total pause frequencies and ungrammatical 
pause frequencies were nearly perfectly correlated (.98 ≤ rs ≤
.99); therefore, we averaged the frequencies of ungrammatical
pauses and total pauses across the three passages per wave to get 
an average pause frequency metric seen in Table 1. Average scores 
on the rating scale increased over time, such that students read with 
better pacing, smoothness, phrasing, and expression and volume 
from the fall of Grade 1 (Wave 1: average rating 1.73 to 1.83) to the
spring of Grade 3 (Wave 6: average rating 2.89 to 3.10).

Students’ mean performances on word reading and listening 
comprehension were in the average and somewhat high average 
ranges. For listening comprehension tasks, mean standard scores 
ranged from 100.64 in OWLS at Wave 1 to 113.25 in WJOC at 
Wave 4. For the word reading tasks, mean standard scores ranged 
from 100.15 in WIAT at Wave 1 to 112.47 in LWID at Wave 2. 

Because of space constraints, the 13 × 13 correlation matrices
per wave across the included tasks are presented in tables in
Appendix A. Among the spectrogram variables, intonation contour
was significantly and negatively related to F0 change (−.46 ≤
rs ≤ −.35) such that those with greater intonation contour made
greater changes in F0. F0 change was weakly related with pause
frequency (.10 ≤ rs ≤ .27). There were practically no relations of
intonation contour with pause frequency (−.11 ≤ rs ≤ .01) and
pause duration (−.04 ≤ rs ≤ .00). There were moderate correlations
between pause duration and pause frequency, which increased
over time (r = .32 in Wave 1 to r = .51 in Wave 6).
Among the rating scale indicators, there were strong correlations in
each wave (.73 ≤ rs ≤ .92). F0 change, pause frequency, and
pause duration were negatively and weakly to strongly related with
rating scale variables (−.79 ≤ rs ≤ −.10), whereas intonation
contour tended to have no relation or positive but weak relations
with rating scale variables (−.01 ≤ rs ≤ .27).

There were negative correlations between the spectrogram measures
and the word reading (−.79 ≤ rs ≤ −.18) and oral language
measures (−.42 ≤ rs ≤ −.04), with the exception of intonation
contour, which was not related to any language or word reading
variables at any wave (−.03 ≤ rs ≤ .11). There were moderate to
strong correlations between the rating scale variables and the word
reading measures (.55 ≤ rs ≤ .77), and weak to moderate correlations
between the rating scale variables and the oral language
measures (.23 ≤ rs ≤ .43). As expected, correlations among the
word reading measures were very strong (.73 ≤ rs ≤ .91), and
correlations between the oral language measures were moderate to
strong (.57 ≤ rs ≤ .64).


Research Question 1: Dimensionality of Reading 
Prosody Variables 


As described above, five alternative models were fit within each 
wave to determine the dimensionality of the reading prosody 
indicators (see Figure 1). Note that for ease of interpretation, F0
change was reverse coded for the dimensionality and predictive
models below so that positive F0 change values indicate greater F0
change. Table 3 presents model fit indices and model comparisons.
Overall, the correlated trait bifactor model (Figure 1e) was selected
as the final model for the following reasons. First, for nested
models, model fit differences were examined by the difference in 
the Satorra-Bentler chi-square test. As shown in the last two 
columns of Table 3, the correlated three-factor model (Figure 1b) 
was preferred to the unidimensional model (Figure 1a) in all 
waves. The bifactor 1 model (Figure 1c) was preferred to the 
correlated three-factor model in Waves 2, 3, and 4. The bifactor 2 
model (Figure 1d) was preferred to the bifactor 1 model in Wave 
4. The bifactor 2 model was preferred to the correlated three-factor 
model in Waves 1, 5, and 6. The correlated trait bifactor model 
(Figure 1e) was preferred to the bifactor 1 model in Wave 3 and to
the bifactor 2 model in Wave 6. In Wave 5, the correlated trait 
bifactor model did not fit significantly differently compared with 
the bifactor 2 model. 
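
The nested comparisons above can be illustrated with the Wave 1 values in Table 3. The sketch below assumes the standard Satorra-Bentler scaled difference formula for scaled chi-squares and their scaling correction factors (SCF); the text does not print the formula, and the function name is illustrative. The nested (more restrictive) model is the one with more degrees of freedom.

```python
def satorra_bentler_scaled_difference(t0, df0, c0, t1, df1, c1):
    """Scaled chi-square difference between a nested model (t0, df0, c0)
    and a comparison model (t1, df1, c1), where t is the scaled chi-square
    and c is the scaling correction factor."""
    if df0 <= df1:
        raise ValueError("The nested model must have more degrees of freedom.")
    # Scaling correction for the difference test.
    cd = (df0 * c0 - df1 * c1) / (df0 - df1)
    # Convert each scaled statistic back to its unscaled value, then rescale.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, df0 - df1


# Wave 1, Table 3: unidimensional (58.49, df = 20, SCF = 1.06)
# versus correlated three-factor (31.17, df = 18, SCF = .97).
trd, ddf = satorra_bentler_scaled_difference(58.49, 20, 1.06, 31.17, 18, 0.97)
print(round(trd, 2), ddf)  # 16.99 2
```

The result reproduces the reported difference test for that comparison (16.99 on 2 degrees of freedom). The `ValueError` branch mirrors the situation described in the Data Analytic Plan, where models that do not satisfy the nesting assumption are compared with nBIC instead.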

Second, for the models that did not meet the assumption of 
nested models (i.e., model fit comparisons of the correlated trait 
bifactor model with the other models in Waves 1, 2, and 4), 
comparisons were conducted using nBIC (see the Data Analytic
Plan above). The correlated trait bifactor model did not fit
practically differently compared with the bifactor 2 model in Wave 1
(ΔnBIC = 0.47) and Wave 4 (ΔnBIC = 1.14), and the correlated
trait bifactor model did not fit practically differently compared with
the bifactor 1 model in Wave 2 (ΔnBIC = 5.4).


Table 3
Model Fit Statistics for the Dimensionality Analyses

Model                         χ²      df  SCF   p      CFI   TLI   RMSEA  90% CI      nBIC     Comparison  Model test
Wave 1
1. Unidimensional             58.49   20  1.06  <.001  .97   .96   .08    [.06, .11]  8498.07
2. Correlated three-factor    31.17   18  .97   .03    .99   .98   .05    [.02, .08]  8471.18  2 vs. 1     Δχ²(2) = 16.99***
3. Bifactor 1                 27.93   15  .88   .02    .99   .98   .06    [.02, .08]  8472.66  3 vs. 2     Δχ²(3) = 3.98 ns
4. Bifactor 2                 10.93   11  .85   .45    1.00  1.00  .00    [.00, .06]  8467.16  4 vs. 2     Δχ²(7) = 19.78**
5. Correlated trait bifactor  13.12   11  .67   .29    1.00  1.00  .03    [.00, .07]  8466.69  5 vs. 4     ΔnBIC = .47 (a)
Wave 2
1. Unidimensional             90.41   20  .97   <.001  .96   .94   .10    [.08, .12]  9770.32
2. Correlated three-factor    41.45   18  .97   .001   .99   .98   .06    [.04, .09]  9728.12  2 vs. 1     Δχ²(2) = 48.96***
3. Bifactor 1                 27.15   15  .94   .03    .99   .99   .05    [.02, .08]  9721.51  3 vs. 2     Δχ²(3) = 13.11**
4. Bifactor 2                 42.69   11  .65   <.001  .98   .95   .09    [.06, .12]  9734.42  4 vs. 3     Δχ²(4) = −1.23 ns
5. Correlated trait bifactor  21.97   10  .81   .02    .99   .98   .06    [.03, .09]  9726.91  5 vs. 3     ΔnBIC = 5.4 (a)
Wave 3
1. Unidimensional             127.30  20  1.02  <.001  .94   .91   .13    [.11, .15]  9386.87
2. Correlated three-factor    69.34   18  1.01  <.001  .97   .95   .09    [.07, .12]  9332.35  2 vs. 1     Δχ²(2) = 53.86***
3. Bifactor 1                 55.56   15  .98   <.001  .98   .96   .09    [.07, .12]  9324.41  3 vs. 2     Δχ²(3) = 13.44**
4. Bifactor 2                 48.74   11  .90   <.001  .98   .94   .10    [.08, .13]  9324.54  4 vs. 3     Δχ²(4) = 8.82 ns
5. Correlated trait bifactor  19.65   12  .97   .07    1.00  .99   .04    [.00, .08]  9297.10  5 vs. 3     Δχ²(3) = 14.26**
Wave 4
1. Unidimensional             124.74  20  .98   <.001  .94   .92   .13    [.11, .15]  9057.20
2. Correlated three-factor    68.80   18  .99   <.001  .97   .96   .09    [.07, .12]  9008.14  2 vs. 1     Δχ²(2) = 60.82***
3. Bifactor 1                 54.11   15  .95   <.001  .98   .96   .09    [.07, .12]  8999.55  3 vs. 2     Δχ²(3) = 14.04**
4. Bifactor 2                 36.61   11  .82   <.001  .99   .96   .09    [.06, .12]  8988.51  4 vs. 3     Δχ²(4) = 16.36**
5. Correlated trait bifactor  28.98   10  .99   .001   .99   .97   .08    [.05, .11]  8989.65  5 vs. 4     ΔnBIC = 1.14 (a)
Wave 5
1. Unidimensional             147.44  20  1.03  <.001  .92   .88   .15    [.12, .17]  8309.38
2. Correlated three-factor    72.01   18  1.05  <.001  .96   .95   .10    [.08, .12]  8238.00  2 vs. 1     Δχ²(2) = 89.71***
3. Bifactor 1                 67.31   15  1.00  <.001  .97   .94   .11    [.08, .13]  8237.09  3 vs. 2     Δχ²(3) = 6.38 ns
4. Bifactor 2                 19.96   11  .98   .046   .99   .99   .05    [.01, .09]  8199.40  4 vs. 2     Δχ²(7) = 48.32***
5. Correlated trait bifactor  23.72   12  1.04  .02    .99   .98   .06    [.02, .09]  8201.88  5 vs. 4     Δχ²(1) = 3.00 ns
Wave 6
1. Unidimensional             130.18  20  1.06  <.001  .92   .89   .14    [.12, .16]  7488.27
2. Correlated three-factor    63.04   18  1.04  <.001  .97   .95   .09    [.07, .12]  7420.91  2 vs. 1     Δχ²(2) = 58.41***
3. Bifactor 1                 65.70   15  .95   <.001  .96   .93   .11    [.08, .14]  7425.63  3 vs. 2     Δχ²(3) = 2.11 ns
4. Bifactor 2                 31.55   11  1.06  <.001  .99   .96   .08    [.05, .11]  7406.33  4 vs. 2     Δχ²(7) = 31.85***
5. Correlated trait bifactor  8.42    12  .97   .75    1.00  1.00  .00    [.00, .04]  7378.58  5 vs. 4     Δχ²(1) = 7.10**

Note. SCF = scaling correction factor used for Satorra-Bentler chi-square tests; CFI = comparative fit index; TLI = Tucker-Lewis index; RMSEA = root mean squared error of approximation; nBIC = sample size-adjusted Bayesian information criterion; ns = not significant. Bold and italicized model is the best fitting model in each wave.
(a) Not a practically important difference.
** p < .01. *** p < .001.
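
The nBIC comparisons for Waves 1 and 4 can be sketched as a simple difference rule. The thresholds (5 for strong and 10 for very strong evidence) are the ones stated in the Data Analytic Plan; the function name and the labels are illustrative, and the nBIC values are those reported in Table 3.

```python
def nbic_evidence(nbic_a: float, nbic_b: float):
    """Return the absolute nBIC difference (rounded to 2 decimals) and a label."""
    delta = round(abs(nbic_a - nbic_b), 2)
    if delta >= 10:
        label = "very strong evidence"
    elif delta >= 5:
        label = "strong evidence"
    else:
        label = "not a practically important difference"
    return delta, label


# Bifactor 2 versus correlated trait bifactor, nBIC values from Table 3.
wave1 = nbic_evidence(8467.16, 8466.69)
wave4 = nbic_evidence(8988.51, 8989.65)
print(wave1)
print(wave4)
```

Both differences fall well below 5, consistent with the statement that the two models did not fit practically differently in these waves.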

Finally, neither F0 change nor intonation contour loaded
significantly onto the Prosody factor in the unidimensional model
(Figure 1a) or onto the general prosody factor in the bifactor 1
model (Figure 1c) and the bifactor 2 model (Figure 1d). Based on
the model fit comparisons and the results of loadings, the
correlated trait bifactor model (Figure 1e) was selected as the
final model. Figure 2 shows the final model results for
the six waves, which are detailed below.

Beginning with the Prosody: Ratings and Pause general factor,
the loadings of the rating scale indicators were large and positive
(.85 ≤ λs ≤ .96), the loadings for pause frequency were large and
negative (−.85 ≤ λs ≤ −.68), and the loadings for pause duration
were negative and moderate in strength (−.54 ≤ λs ≤ −.42). The
negative loadings of the pause frequency and pause duration
variables indicate that the Prosody: Ratings and Pause general
factor captures higher scores in the rating scale variables and lower
frequencies and shorter durations of pause. Factor reliability was
estimated using McDonald’s coefficient omega (ω; McDonald, 1999),
which has been shown to be an appropriate
measure of factor reliability when data are modeled in a bifactor
structure (Reise, 2012). Prosody: Ratings and Pause was a reliable
factor in all waves (McDonald’s ω range = .75 to .78). The pattern
of consistent loadings and high reliability indicates that the Prosody:
Ratings and Pause general factor captured variance common
to both the rating scale indicators and pause indicators.
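
To make the reliability index above concrete, the sketch below computes one common formulation of omega for a general factor in a bifactor model (omega hierarchical) from standardized loadings. Everything in it is an assumption for illustration: the loadings are hypothetical, the pause indicators are treated as reflected so that all general loadings are positive, and the specific factors are treated as orthogonal (the final model allows the Ratings and Pause specific factors to correlate). It is not expected to reproduce the reported values.

```python
from collections import defaultdict


def omega_hierarchical(indicators):
    """indicators: list of (general_loading, specific_loading, specific_factor).

    Returns the share of total score variance attributable to the general
    factor, assuming standardized indicators and orthogonal factors.
    """
    general_sum = sum(g for g, _, _ in indicators)
    specific_sums = defaultdict(float)
    residual_total = 0.0
    for g, s, factor in indicators:
        specific_sums[factor] += s
        residual_total += 1.0 - g**2 - s**2  # unique variance of the indicator
    general_part = general_sum**2
    specific_part = sum(total**2 for total in specific_sums.values())
    return general_part / (general_part + specific_part + residual_total)


# Hypothetical loadings: four rating indicators and two (reflected) pause indicators.
hypothetical = [
    (0.90, 0.20, "ratings"),
    (0.90, 0.20, "ratings"),
    (0.90, 0.20, "ratings"),
    (0.90, 0.20, "ratings"),
    (0.75, 0.30, "pause"),
    (0.50, 0.60, "pause"),
]
omega_h = omega_hierarchical(hypothetical)
print(round(omega_h, 2))  # 0.89
```

The same function applied to a specific factor's loadings would show why weak or inconsistent specific loadings yield low reliability.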

The Prosody: Pitch factor had similar patterns of loadings across
all waves. F0 change loaded positively and strongly (.72 ≤ λs ≤
.97), and intonation contour loaded positively and moderately
(.37 ≤ λs ≤ .50). Although the pattern of loadings was consistent,
this factor was not consistently reliable (range across waves:
McDonald’s ω = .54 to .72; Wave 5 was the only wave to reach a
McDonald’s omega criterion of at least .70). The correlation
between Prosody: Pitch and the Prosody: Ratings and Pause general
factor was positive, and the magnitude increased from weak to
moderate over time: .16 at Wave 1 (p = .282), .27 at Wave 2 (p <
.001), .30 at Wave 3 (p < .001), .29 at Wave 4 (p < .001), .32 at
Wave 5 (p < .001), and .37 at Wave 6 (p < .001). The correlation
between Prosody: Pitch and the Ratings specific factor was
significant and positive in Waves 3 through 6 (.16 ≤ rs ≤ .91).
Prosody: Pitch was significantly and positively related to the Pause
specific factor in only Wave 3, r = .18, p = .023.

Turning to loadings for specific factors, the pattern of loadings
for the Ratings specific factor was not consistent over time. The
only consistent loading was expression and volume, which loaded
significantly at every wave except for Wave 2 (Wave 1 λ = −.38,
p = .001; Wave 3 λ = .46, p < .001; Wave 4 λ = .47, p < .001;
Wave 5 λ = .13, p = .02; Wave 6 λ = .47, p < .001). There were no
remaining consistent indicators of the Ratings specific factor. The
Ratings specific factor was also unreliable at all six waves (range:
McDonald’s ω = .00 to .21). The pattern of inconsistent loadings and
unreliability in the Ratings specific factor across the six waves of
data indicates that there was no additional construct-relevant
variance that could be accounted for beyond what was captured in the
Prosody: Ratings and Pause general factor.

The Pause specific factor was not a reliable factor in any wave
(range: McDonald’s ω = .04 to .47). The loading of pause duration was
not significant in Wave 1 (p = .19), was significant but weak in Wave
2 (λ = .28, p = .036), and was strong and significant in
Waves 3 to 6 (.85 ≤ λs ≤ .88, ps < .001). The correlation between
the Pause and Ratings specific factors was not estimable in Wave
1 because of a model nonconvergence issue and was fixed to zero;
the correlation was significant and negative in Wave 3, r = −.17,
p < .01, and was not significantly different from zero in the
remaining waves (ps > .05).

In summary, the correlated trait bifactor model, where the 
ratings and pause indicators had a bifactor structure and the pitch 
indicators were captured by another factor (Figure 1e), described
the data best across all the waves. As stated above, the general 
factor, Prosody: Ratings and Pause, was the most reliable factor 
and captured common variance across the six ratings and pause 
indicators (expression, phrasing, smoothness, pace, pause fre- 
quency, pause duration). There was a trend that the Pause specific 
factor accounted for an additional portion of variance in the pause 
indicators over and above what was captured in the Prosody: 
Ratings and Pause general factor, but the Pause specific factor was 
not reliable in any wave. The Ratings specific factor was also 
unreliable across the waves over and above the Prosody: Ratings 
and Pause general factor. 


Research Question 2: The Relations of Word Reading 
and Listening Comprehension to the Identified 
Dimensions of Reading Prosody 


For the CFA models including listening comprehension and
word reading (see Appendices B, C, and D for factor loadings,
model fit statistics, and factor correlations, respectively), the loadings
were significant and strong to very strong for the word
reading measures (.77 ≤ λs ≤ .98, ps < .001) and were strong to
very strong for the listening comprehension measures (.69 ≤ λs ≤
.84, ps < .001). The correlations between word reading and
listening comprehension were moderate (.52 ≤ rs ≤ .67, ps <
.001). Word reading was positively and strongly correlated with
the Prosody: Ratings and Pause general factor (.80 ≤ rs ≤ .87,
ps < .001), was positively and weakly to moderately correlated
with the Prosody: Pitch factor (.19 ≤ rs ≤ .34, ps < .001), and was
negatively and weakly to moderately correlated with the Pause
specific factor (−.35 ≤ rs ≤ −.22, ps < .003). Word reading was
not related to the Ratings specific factor (ps > .27). Listening
comprehension was moderately and positively correlated with the
Prosody: Ratings and Pause general factor (.46 ≤ rs ≤ .55, ps <
.001). Listening comprehension was not related to the Ratings
specific factor (ps > .11) or Pause specific factor (ps > .05).
Listening comprehension was positively and weakly related to the
Prosody: Pitch factor in Wave 3, Wave 5, and Wave 6 (.15 ≤ rs ≤
.19, ps < .05).

The structural equation model shown in Figure 3 was modified
by removing the pathways from word reading and listening comprehension
to the specific Ratings factor and the specific Pause
factor. The specific Ratings factor was not reliable, nor did it have
any significant relations with word reading and listening comprehension
factors in bivariate correlations (see the preceding paragraph).
The specific Pause factor was related to word reading, but
this factor was not reliable. Below, we report results from the
modified Figure 3 model, but the results of the original Figure 3
model with all pathways are presented in Appendix E.

Figure 2. Dimensionality of prosody by wave. Residual variances were estimated, but not shown for figure
brevity. Factor variances were fixed at 1 for model identification purposes. Gray, dashed pathways were not
statistically significant (p > .05). Smth = smoothness; Pace = pacing; Phrase = phrasing; Expr = expression
and volume; Pause Freq = pause frequencies; Pause Dur = pause durations; F0 Δ = fundamental frequency
change; Int Cont = intonation contour. * p < .05. ** p < .01. *** p < .001.


The modified Figure 3 model was fitted across each of the six
waves of data, using the Prosody: Ratings and Pause factor and the
Prosody: Pitch factor as the outcomes. Results using standardized
regression weights are presented in Figure 4 (factor loadings from
the estimated model are presented separately in Table 4). Results
at Wave 1, the fall of Grade 1, were as follows. Word reading was
strongly related to the Prosody: Ratings and Pause general factor
(γ = .83, p < .001). After accounting for word reading, listening
comprehension did not explain any additional variance in the
Prosody: Ratings and Pause general factor (γ = .03, p = .62). A
similar pattern was found for the Prosody: Pitch factor such that
word reading moderately and independently predicted Prosody:
Pitch (γ = .32, p < .001), but listening comprehension did not
independently predict Prosody: Pitch (γ = −.01, p = .88). The
correlation between listening comprehension and word reading
was moderate, r = .55, p < .001.

Figure 3. Structural equation model whereby word reading and listening comprehension predict reading
prosody general and specific factors. All pathways were estimated, and factor variances were fixed at 1 for model
identification purposes. Residual variances for all indicators are not shown for figure brevity. WIAT = Wechsler
Individual Achievement Test; LWID = Letter-Word Identification; SWE = Sight Word Efficiency; OWLS =
Oral and Written Language Scales; WJOC = Woodcock-Johnson III oral comprehension; Smth = MFS
(Multi-Dimensional Fluency Scale) smoothness; Pace = MFS pacing; Phrase = MFS phrasing; Expr = MFS
expression and volume; Pause Freq = pause frequencies; Pause Dur = pause durations; F0 Δ = fundamental
frequency change; Int Cont = intonation contour.

Results in subsequent waves were highly similar to those in
Wave 1 (see Figure 4 and Table 4). Word reading was related to
the Prosody: Ratings and Pause general factor (.78 ≤ γs ≤ .95,
ps < .001) and the Prosody: Pitch factor (.21 ≤ γs ≤ .38, ps <
.01), whereas listening comprehension was not (−.10 ≤ γs ≤ .08,
ps ≥ .05) after accounting for word reading. Across all six waves,
68.3% to 80.6% of the variance was explained in the Prosody:
Ratings and Pause general factor (Waves 1 to 6, respectively:
72.0%, 74.9%, 68.3%, 74.4%, 77.9%, 80.6%) and 4.5% to 11.8% of
the variance was explained in the Prosody: Pitch factor (Waves 1
to 6, respectively: 9.1%, 4.5%, 5.8%, 5.0%, 9.7%, 11.8%).
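
The link between the standardized coefficients and the variance explained can be approximated by hand for Wave 1. The sketch below assumes the standard identity for two correlated, standardized predictors, R² = γ₁² + γ₂² + 2γ₁γ₂r, and uses the rounded Wave 1 values reported in the text, so it only approximates the model-based estimate. The function name is illustrative.

```python
def r_squared_two_predictors(gamma_1: float, gamma_2: float, predictor_r: float) -> float:
    """Variance explained by two correlated predictors with standardized weights."""
    return gamma_1**2 + gamma_2**2 + 2 * gamma_1 * gamma_2 * predictor_r


# Wave 1, Prosody: Ratings and Pause general factor:
# word reading gamma = .83, listening comprehension gamma = .03, predictor r = .55.
r2_general = r_squared_two_predictors(0.83, 0.03, 0.55)
print(round(r2_general, 3))  # 0.717
```

The result, about 72%, is close to the 72.0% reported for Wave 1, and nearly all of it comes from the word reading term (.83² ≈ .69).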


Discussion 


Text reading fluency or oral reading fluency has been widely 
studied as an important skill in reading development. Although 
reading prosody has been recognized as an important part of the 
text reading fluency construct (Kuhn et al., 2010; National Institute
of Child Health and Human Development, 2000), its evidence base
is substantially more limited compared with that for text reading 
efficiency (accuracy and speed). Reading prosody has been exam- 
ined in multiple aspects, measured using either spectrogram or 
rating scales, which differ in degree of precision and practicality. 
Spectrographic measurements have strengths such as precise esti- 
mation, but wide use of them in the classroom setting is limited 
because of the specific expertise required to use them and the 
time-intensive nature in analyzing the data. Rating scales, on the 
other hand, are more classroom and teacher friendly, but their 
precision is not comparable with spectrographic measurements. 
Although multiple aspects of reading prosody have been widely 
examined using both approaches, no previous studies have inves- 
tigated the dimensionality of a comprehensive set of reading 
prosody indicators. To address this gap, we investigated reading 
prosody in terms of its dimensionality, its relations with word 
reading and listening comprehension, and the developmental na- 
ture of these dimensions and relations, using longitudinal data 
from Grade 1 to Grade 3.


Dimensionality of Reading Prosody 


The findings of the present study advance our understanding of
measurement and dimensionality of reading prosody in three important
ways. First, we found that reading prosody is multidimensional.
When we examined dimensionality by systematically fitting
and comparing five alternative models (see Figure 1), neither
the unidimensional model (Figure 1a) nor the three-factor model
(Figure 1b) was supported, indicating that the eight reading prosody
indicators (expression, phrasing, smoothness, pace, pause duration,
pause frequency, F0 change, and intonation contour) do not
capture a single underlying dimension or three separate but related
dimensions of ratings, pause, and pitch. Three variants of bifactor
models (Figures 1c to 1e) were also compared, and results revealed
that the Figure 1e model, the correlated trait bifactor model,
described the data best. That is, the rating scale and pause variables
had a bifactor structure composed of a general factor (called
Prosody: Ratings and Pause) and a Ratings specific factor and a
Pause specific factor. In addition, the pitch-related indicators
(intonation contour and F0 change) did not fit into the bifactor
structure, but instead formed a separate but related dimension. The
intonation contour and F0 change variables had very weak or weak
relations with pause frequency, pause duration, and the ratings
variables (see Appendix A), and consequently did not load on the
general factor in the Figures 1c and 1d models. The weak relations
are convergent with previous findings (e.g., Binder et al., 2013;
Schwanenflugel et al., 2004), but extend previous studies by showing
the factor structure of these variables.

Figure 4. Standardized coefficients of word reading and listening comprehension predicting reading prosody
latent variables by wave. Gray, dashed lines are not statistically significant (p > .05). Not shown but estimated
were measurement models for word reading, listening comprehension, and reading prosody factors and the
specific ratings and specific pause factors. ** p < .01. *** p < .001.

Second, our findings showed that the ratings indicators, as
measured by MFS, together with pause structure indicators had
a bifactor structure. Despite wide use of the MFS, its dimensionality
and relations with other prosody indicators remained a
black box. Our results revealed that the four indicators of the
MFS had moderate relations with pause duration, strong relations
with pause frequency, and zero to weak relations with
pitch variables¹ (see Appendix A). Moreover, the four indicators
had very strong loadings on the general factor, Prosody:
Ratings and Pause (see Figure 2). Overall, these results indicate
that the four aspects of the MFS, although they evaluate somewhat
different aspects (expression and volume, phrasing,
smoothness, and pace), are largely measuring what is shared
with pause structures in reading prosody rather than a pitch
aspect, at least in early reading development from Grade 1 to
Grade 3. These results are also in line with the very strong
relation of word reading skill with the Prosody: Ratings and
Pause general dimension (see Figure 4).

¹ The relations of the Ratings specific factor to the pitch factor varied
widely, from no statistically significant relations (Wave 1 and Wave 2) to
a very strong relation (Wave 5). However, because the Ratings specific
factor was mostly unreliable, these results may not be particularly
meaningful.

Third, our speculation of a developmentally changing nature
of relations among identified dimensions was supported. Specifically,
our data revealed a generally steady increase in the magnitude of
the relation between the Pitch factor and the Ratings and Pause
general factor over time: .16, .27, .30, .29, .32, and .37 across the
six time points from the beginning of Grade 1 to the end of
Grade 3. This is likely attributable to children’s development of
word reading skill, which is strongly associated with the Prosody:
Ratings and Pause general factor, and the consequent decrease
in the constraining role of word reading, which frees up cognitive
resources for semantic processing (LaBerge & Samuels, 1974)
and allows an increasing relation with the Pitch factor.
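
To make the size of this trend concrete, the six reported correlations can be summarized by their wave-to-wave changes and a least-squares slope. This is only a descriptive sketch on the published point estimates; it treats the waves as equally spaced (an assumption) and ignores sampling error, so it is not a test of the trend.

```python
# Pitch factor with Ratings and Pause general factor, Waves 1 to 6 (from the text).
corrs = [0.16, 0.27, 0.30, 0.29, 0.32, 0.37]
waves = list(range(1, len(corrs) + 1))  # assumption: equally spaced waves

# Change from each wave to the next.
diffs = [round(b - a, 2) for a, b in zip(corrs, corrs[1:])]

# Ordinary least-squares slope of correlation on wave number.
mean_x = sum(waves) / len(waves)
mean_y = sum(corrs) / len(corrs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(waves, corrs))
sxx = sum((x - mean_x) ** 2 for x in waves)
slope = sxy / sxx

print("wave-to-wave changes:", diffs)
print(f"average change per wave: {slope:.3f}")
```

The changes are positive at every step except a .01 dip between Waves 3 and 4, with an average gain of about .03 per wave.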


The Relations of Word Reading and Listening 
Comprehension to Reading Prosody 


Another striking finding is the consistent relation of word 
reading to reading prosody across the six time points from 
Grade 1 to Grade 3 (see Figure 4). Beyond this overall finding,
there are a couple of important nuanced patterns and magni- 
tudes of relations that are revealing. First, word reading was 
consistently, strongly, and positively related to the Ratings and 
Pause general factor (.78-.95). In other words, higher word 
reading skill was associated with higher performance on the 
Prosody: Ratings and Pause general factor, indicating that the 
Prosody: Ratings and Pause general factor is very strongly 
influenced by word reading development. Second, it is note- 
worthy that word reading skill was also positively and weakly 
to moderately related to the Pitch factor (.21-.38), indicating 
higher word reading skill was associated with higher performance
in pitch (greater F0 change and greater intonation contour).
The relation of word reading to the pitch dimension can
be interpreted according to the automaticity theory (LaBerge &
Samuels, 1974) and the verbal efficiency theory (Perfetti,
1992): word reading skill frees up cognitive resources to allow
one’s attention to semantic processing, which, in turn, permits
reading with greater pitch variation. The stronger relation of
word reading skill with the Prosody: Ratings and Pause general
factor compared with the Pitch factor is in line with prior
work which showed stronger relations of word reading with pause
structure variables than with pitch variables (e.g., Binder et al.,
2013; Cowie, Douglas-Cowie, & Wichmann, 2002; Miller &
Schwanenflugel, 2008; Schwanenflugel et al., 2004, 2015).
These results underscore the importance of word reading skill in 
the pause-related dimension of reading prosody, and also add to 
the literature by showing that the four aspects examined in MFS 
are also strongly related to word reading skill. 

We hypothesized that listening comprehension would be
related to reading prosody, particularly after the very beginning
phase of reading development when word reading plays a large
constraining role, because reading prosody theoretically involves
semantic processing (e.g., Kuhn et al., 2010). Listening
comprehension had a moderate relation with the Prosody: Rat- 
ings and Pause general factor and a weak relation with the 
Prosody: Pitch factor in bivariate correlations (see Appendix 
D). However, it was not independently related to reading pros- 
ody after accounting for word reading in any of the develop- 
mental time points from Grade 1 through Grade 3. Furthermore, 
the magnitude of bivariate correlations between listening com- 
prehension and reading prosody indicators by and large re- 
mained similar across the time points (see Appendix D), not 
supporting the hypothesis about a changing relation between 
listening comprehension and reading prosody, at least for 
English-speaking children in Grades 1 to 3. This finding should
not be taken as a lack of a relation of listening comprehension 
to reading prosody. Instead, this result might indicate and 
underscore the importance of word reading in reading prosody 
as word reading acts as a bottleneck during the beginning phase 
of reading development (Kim, 2015; Kim & Wagner, 2015). 
Future longitudinal studies beyond primary grades are neces- 
sary to elucidate whether listening comprehension or language 
skills make a unique contribution to reading prosody over and 
above word reading, and if so, when in the developmental phase 
this occurs. Additionally, studies have shown that text reading 
efficiency is a mediator of the relations of word reading and 
listening comprehension to reading comprehension (Kim, 2015; 
Kim & Wagner, 2015). Given the relation of word reading to 
reading prosody in the present study and prior work (e.g., Benjamin & 
Schwanenflugel, 2010; Binder et al., 2013; Schwanenflugel et al., 
2004), as well as the relation of reading prosody to reading compre- 
hension (e.g., Arcand et al., 2014; Binder et al., 2013; Calet et al., 
2015; Groen et al., 2019; Klauda & Guthrie, 2008; Schwanenflugel et 
al., 2004; Veenendaal et al., 2014) and the relation of listening 
comprehension to reading comprehension (Adlof et al., 2006; Florit & 
Cain, 2011; Hoover & Gough, 1990; Kim, 2017, 2020), a future 
investigation should shed light on a potential mediating role of read- 
ing prosody in the relations of word reading and listening compre- 
hension to reading comprehension. 


Limitations, Future Directions, and Implications 


As is the case with any study, results should be interpreted 
with the research design in mind. First, generalizability of the 
findings is limited to populations that are similar to the sample 
in the present study—English-speaking children in primary 
grades. Second, data were not missing completely at random in 
Waves 1 to 4 (Grades 1 and 2), and therefore, this should be
considered in the generalizability of the findings. Third, al- 
though we included a comprehensive set of widely used fea- 
tures or indicators of reading prosody, other prosodic features 
(e.g., adult-child F0 match; Comprehensive Oral Reading Fluency
Scale, Benjamin et al., 2013) can be included in future
studies. Fourth, previous studies indicated that reading prosody 
is influenced by text features such as syntactic structures (e.g., 
Miller & Schwanenflugel, 2006). In the present study, we used 
texts that were normed in the state where the study was con- 
ducted and that were not specifically developed for the role of 
text complexity in reading prosody. Therefore, although these 
texts are similar to the texts that children are likely to encounter
in real life, future studies where sentence structures and types 
(e.g., exclamation, sarcasm) are intentionally manipulated 
would be useful for examining questions related to text features. 

Another direction for future studies includes longitudinal 
investigations beyond the primary grades. The dimensionality 
of reading prosody and the predictive relations of word reading 
and listening comprehension to reading prosody may change 
with reading development as the constraining role of word 
reading decreases. In addition, in the present study we exam- 
ined listening comprehension as a predictor of reading prosody, 
given that listening comprehension captures oral comprehen- 
sion at the discourse level and involves semantic processes and 
draws on vocabulary and morphosyntactic and syntactic knowl- 
edge (e.g., Kendeou, Bohn-Gettler, White, & van den Broek, 
2008; Kim, 2015, 2017, 2020; Kim & Phillips, 2014; Lepola, 
Lynch, Laakkonen, Silvén, & Niemi, 2012). However, future 
studies can replicate and extend the present study by examining 
the relations of vocabulary and syntactic knowledge to reading 
prosody, controlling for word reading. Finally, previous studies 
have suggested the relations of reading prosody with lexical- 
level prosody—prosodic sensitivity (Schwanenflugel & Benja- 
min, 2017), text reading efficiency (Miller & Schwanenflugel, 
2008; Schwanenflugel et al., 2004), and reading comprehension 
(Benjamin & Schwanenflugel, 2010; Calet et al., 2015; Miller 
& Schwanenflugel, 2008; Schwanenflugel et al., 2004, 2015). 
Future investigations using longitudinal and experimental de- 
signs are necessary to further elucidate the nature of their 
relations. 

The dimensionality results suggest that reading prosody in- 
struction may attend to two different aspects: pitch and pause 
structure. Together with previous suggestions (e.g., Benjamin et 
al., 2013), these results suggest that evaluation of reading 
prosody in research and practice should consider both aspects, 
but keep in mind that prosody is largely a function of word 
reading skill at least for children in primary grades learning to 
read English. Also informative was the finding that the different 
aspects of MFS are largely shared with the pause structure 
aspect of reading prosody, again for English-speaking students 
in primary grades; therefore, inferences drawn from MFS can 
be made with this result in mind. 

Importantly, the present study, in conjunction with prior 
work, indicates that reading prosody instruction should not be 
isolated from word reading or text reading efficiency. Theoret- 
ically, word reading and text reading efficiency are necessary 
foundations for reading prosody given their constraining roles 
(Kuhn et al., 2010; LaBerge & Samuels, 1974). Empirically, 
word reading and text reading efficiency have strong relations 
to reading prosody in the present study as well as in previous 
ones (e.g., Calet et al., 2015; Miller & Schwanenflugel, 2008; 
Schwanenflugel et al., 2004, 2015). A study showed that in- 
struction of prosody versus reading rate had somewhat disparate 
effects on students’ reading such that feedback on reading rate, 
but not reading prosody, had a large effect on reading rate, 
whereas feedback on prosody had a large effect on students’ 
pause structure (e.g., pause after commas and between sen- 
tences; Ardoin, Morena, Binder, & Foster, 2013). However, 
literature on effective prosody instruction, let alone effective 
text reading fluency instruction that includes both word reading
skill and reading prosody, is thin, and therefore, future studies
are required.

Appendix A
Correlation Matrices at Waves 1, 2, 3, 4, 5, and 6 


Table A1
Correlation Matrices at Wave 1 (Below Diagonal) and Wave 2 (Above Diagonal)


Variable 1 2 3 4 5 6 7 8 9 10 11 12 13
1. Fy change a — 35 d 05 16 16 AT 21 13 04 21 20 18 
2. Int cont 35 02 04 03 Al 06 ll 05 03 05 04 —-01 
3. Pau freq 20 Ol = 46 72 73 75 69 31 29 69 71 71 
4. Pau dur 16 —.04 32 A4 AT 48 44 17 ll 42 40 Sl 
5. Smth —23 10 64 40 87 84. 71 29 24 67 71 73 
6. Pace —27 120 —61 —37 90 =< 88 80 31 29 71 73 75 
7. Phra —.26 120 —61 —37 92 90 aa 84 34 32 71 73 74 
8. Expr — 34 15-57 36 87 85 89 = 32 28 68 67 69 
9. WJOC 07 02 30 24 34 35 31 30 = 57 38 39 37 
10. OWLS —.13 01 -29  —-18 35 29 31 28 58 = 39 39 33 
11. LWID —27 08  —.61 — 45 3 71 74 67 37 43 = 90 84 
12. WIAT —.26 09 -67  —Al 16 2B 75 69 38 A4 90 = 87 
13. SWE —.30 05 -69 —.46 aT 75 75 70 36 Al 87 89 = 


Note. F0 change = fundamental frequency change; Int cont = intonation contour; Pau freq = pause frequencies; Pau dur = pause durations; Smth = MFS
(Multi-Dimensional Fluency Scale) smoothness; Pace = MFS pacing; Phra = MFS phrasing; Expr = MFS expression and volume; WJOC = 
Woodcock-Johnson III oral comprehension; OWLS = Oral and Written Language Scales; LWID = Letter-Word Identification; WIAT = Wechsler 
Individual Achievement Test; SWE = Sight Word Efficiency. 


Table A2 
Correlation Matrices at Wave 3 (Below Diagonal) and Wave 4 (Above Diagonal) 


Variable 1 2 3 4 5 6 7 8 9 10 11 12 13
1. Fy change — — 41 .26 04 18 21 25 33 07 08 22 .22 sd, 
2. Int cont — 42 —_ —.10 O1 lS A3 09 18 05 .06 -05 .03 .04 
3. Pau freq 2D —.10 —_ 44 74 78 78 71 a) 40 70 72 76 
4. Pau dur O01 .00 50 44 7 45 43 AS al'D 43 44 55 
5. Smth =,20 12 76 49 .89 87 .82 32 Al 65 69 69 
6. Pace —.24 18 —.78 — 48 .87 —_ 89 81 37 42 69 71 73 
7. Phra —.24 .20 —.76 — 45 84 89 _ 84 33 ak) .69 71 74 
8. Expr —.30 sl = 13: =a 79 79 83 _ 29 36 62 68 70 
9. WJOC —.05 05 —.38 =23 39 Al 35 sil _— 7 39 35 34 
10. OWLS —.16 05 —.38 Slt 40 43 39 37 .63 —_ AT A4 37 
11. LWID 22: .06 = 15 —.38 64 67 64 65 44 47 _ 1 79 
12. WIAT = 25: .05 1 —.34 62 64 61 .60 42 44 91 —_— .80 

13. SWE =21 11 —.79 = 511 68 68 66 67 35 38 .80 81 _ 


Note. F0 change = fundamental frequency change; Int cont = intonation contour; Pau freq = pause frequencies; Pau dur = pause durations; Smth = MFS
(Multi-Dimensional Fluency Scale) smoothness; Pace = MFS pacing; Phra = MFS phrasing; Expr = MFS expression and volume; WJOC = 
Woodcock-Johnson III oral comprehension; OWLS = Oral and Written Language Scales; LWID = Letter-Word Identification; WIAT = Wechsler 
Individual Achievement Test; SWE = Sight Word Efficiency. 

Table A3

Correlation Matrices at Wave 5 (Below Diagonal) and Wave 6 (Above Diagonal) 

Variable 1 2 3 4 5 6 7 8 9 10 11 12 13 

1. Fy change — —.44 Zt 10 .18 27 .28 44 15 13 .26 122 32 
2. Int cont 46 11 .02 10 15 14 27 .09 10 .03 .04 sill, 
3. Pau freq .26 —.04 — 51 72 FS .73 .68 33 35 70 71 76 
4. Pau dur .09 .04 50 42 44 Al 39 521 .24 35 36 48 
5. Smth .10 01 .73 A4 84 86 74 .26 30 65 .64 70 
6. Pace —.18 .05 —.79 —.49 84 — 86 .73 .28 36 .66 .64 71 
7. Phra —.20 03 —.78 — Al .85 .85 — 74 .26 34 65 63 .68 
8. Expr —.36 15 —.70 — 45 .75 78 .79 — 23 33 59 58 65 
9. WJOC .09 O01 42 22 3D 35 36 32 —_— 62 48 Al 31 
10. OWLS —.21 .03 — Al =—,22 239 40 39 37 64 — 54 AT 35 
11. LWID .25 .02 74 38 .62 .65 .66 S57 33 52 — 90 74 
12. WIAT .26 .03 71 Al 58 61 61 55 51 49 .89 — 73 
13. SWE .29 O01 76 50 64 MAL .67 .62 Al 40 76 76 — 

Note. F0 change = fundamental frequency change; Int cont = intonation contour; Pau freq = pause frequencies; Pau dur = pause durations; Smth = MFS
(Multi-Dimensional Fluency Scale) smoothness; Pace = MFS pacing; Phra = MFS phrasing; Expr = MFS expression and volume; WJOC =
Woodcock-Johnson III oral comprehension; OWLS = Oral and Written Language Scales; LWID = Letter-Word Identification; WIAT = Wechsler
Individual Achievement Test; SWE = Sight Word Efficiency.


Appendix B


Factor loadings for listening comprehension and word reading latent variables by wave


Table B1
Factor Loadings for Listening Comprehension and Word Reading Factors Separated by Wave 

                              Wave 1            Wave 2            Wave 3            Wave 4            Wave 5            Wave 6
Variable                   λ   SE     p      λ   SE     p      λ   SE     p      λ   SE     p      λ   SE     p      λ   SE     p
Word reading
  SWE                     .93  .01  <.001   .90  .01  <.001   .85  .02  <.001   .83  .02  <.001   .80  .02  <.001   .77  .02  <.001
  LWID                    .94  .01  <.001   .93  .01  <.001   .95  .01  <.001   .95  .01  <.001   .95  .01  <.001   .98  .01  <.001
  WIAT                    .26  .01  <.001   .97  .07  <.001   .95  .01  <.001   .96  .01  <.001   .94  .01  <.001   .93  .01  <.001
Listening comprehension
  OWLS                    .82  .05  <.001   .75  .05  <.001   .82  .04  <.001   .83  .05  <.001   .79  .04  <.001   .84  .04  <.001
  WJOC                    .71  .05  <.001   .76  .05  <.001   .78  .04  <.001   .69  .05  <.001   .81  .04  <.001   .74  .04  <.001


Note. λ = loading; SWE = Sight Word Efficiency; LWID = Letter-Word Identification; WIAT = Wechsler Individual Achievement Test; OWLS =
Oral and Written Language Scales; WJOC = Woodcock-Johnson III oral comprehension.


Appendix C
Model fit for confirmatory factor models of listening comprehension, word reading, and prosody indicators by wave 


Table C1

Model fit for CFA Models of Listening Comprehension, Word Reading, and Prosody Indicators by Wave

Wave      χ²      df   SCF    p       CFI    TLI    RMSEA   90% CI        nBIC
Wave 1    83.01   48   0.99   .001    0.99   0.98   .05     [.03, .06]    19045.58
Wave 2    80.36   48   0.99   .002    0.99   0.99   .04     [.03, .06]    20912.72
Wave 3   114.90   49   0.99   <.001   0.98   0.97   .06     [.05, .08]    19317.37
Wave 4   117.31   48   0.99   <.001   0.98   0.97   .07     [.05, .08]    18996.82
Wave 5   104.32   49   0.99   <.001   0.98   0.97   .06     [.04, .08]    17784.39
Wave 6   112.23   48   1.01   <.001   0.98   0.96   .07     [.05, .08]    16546.06


Note. CFA = confirmatory factor analysis; SCF = scaling correction factor used for Satorra-Bentler chi-square tests; CFI = comparative fit index; TLI =
Tucker-Lewis index; RMSEA = root mean squared error of approximation; nBIC = sample size-adjusted Bayesian information criterion.


Appendix D 


Correlations between reading prosody, word reading, and listening comprehension latent variables 


Table D1
Factor Correlations for the CFA Models of Reading Prosody, Word Reading, and Listening Comprehension

                                   Wave 1            Wave 2            Wave 3            Wave 4            Wave 5            Wave 6
Factor                          r   SE    p        r   SE    p        r   SE    p        r   SE    p        r   SE    p        r   SE    p
Prosody: Pitch factor with
  Prosody: Ratings and pause   .23  .07  .001     .18  .06  .002     .27  .06  <.001    .23  .06  <.001    .24  .06  <.001    .28  .13  .026
  Ratings specific factor     −.26  .08  .001    −.12  .07  .07     −.15  .06  .02     −.27  .07  <.001    .38  .06  <.001    .38  .10  <.001
  Pause specific factor       −.05  .07  .52      .05  .06  .44     −.08  .06  .14     −.13  .07  .05     −.12  .06  .03     −.08  .17  .62
  Listening comprehension      .15  .08  .07      .12  .07  .07      .15  .07  .03      .11  .07  .09      .19  .07  .007     .17  .06  .005
  Word reading                 .30  .06  <.001    .19  .06  .001     .26  .05  <.001    .25  .05  <.001    .32  .06  <.001    .34  .06  <.001
Ratings specific factor with
  Pause specific factor       −.03  .10  .75      .08  .17  .63      n/a  n/a  n/a     −.02  .17  .91      n/a  n/a  n/a     −.19  .40  .64
  Listening comprehension      .06  .08  .46     −.07  .09  .46      .11  .07  .1?      .05  .09  .61      .01  .07  .86      .06  .15  .67
  Word reading                 .04  .09  .65     −.02  .11  .85     −.04  .03  .27     −.04  .10  .70     −.02  .04  .63      .10  .25  .71
Word reading with
  Listening comprehension      .55  .05  <.001    .52  .06  <.001    .57  .05  <.001    .56  .06  <.001    .67  .05  <.001    .60  .06  <.001
  Prosody: Ratings and pause   .84  .03  <.001    .85  .02  <.001    .80  .02  <.001    .85  .02  <.001    .85  .03  <.001    .87  .04  <.001
  Pause specific factor       −.22  .04  <.001   −.23  .04  <.001   −.35  .04  <.001   −.24  .06  <.001   −.24  .04  <.001   −.26  .09  .003
Prosody: Ratings and pause with
  Listening comprehension      .49  .06  <.001    .46  .05  <.001    .53  .05  <.001    .53  .05  <.001    .55  .05  <.001    .47  .06  <.001
Pause specific factor with
  Listening comprehension     −.12  .07  .09     −.09  .06  .13     −.08  .06  .19     −.12  .06  .05     −.10  .07  .14     −.08  .08  .31


Note. CFA = confirmatory factor analysis. 

Appendix E


Standardized coefficients of word reading and listening comprehension predicting all the reading prosody latent 
variables, including the specific Ratings factor and the specific Pause factor, by wave 

Figure E1. Standardized coefficients of word reading and listening comprehension predicting reading prosody latent
variables by wave. Gray, dashed lines are not statistically significant (p > .05). Not shown but estimated were measurement
models for word reading, listening comprehension, and reading prosody factors. * p < .05. *** p < .001.

Results of the original model in Figure 3, which includes the
pathways from word reading and listening comprehension to the
Ratings specific factor and the Pause specific factor, are presented
in Figure E1.

In Wave 1, word reading was negatively and independently related
to the Pause specific factor (γ = −.61, p < .001), whereas
listening comprehension was not (p = .81). Neither listening
comprehension nor word reading predicted the Ratings specific factor
(p = .52 and p = .71, respectively). The model explained 38.8%
of the variance in the Pause specific factor and no significant
variance in the Ratings specific factor (p = .79). In subsequent
waves, word reading was related to the Pause specific factor
(−.96 ≤ γs ≤ −.66, ps < .001), whereas listening comprehension
was not (−.03 ≤ γs ≤ .13, ps ≥ .05) after accounting for word
reading. The Ratings specific factor was not predicted by word
reading or listening comprehension in any subsequent wave.
Between 38.8% and 82.9% of the variance was explained in the Pause
specific factor across waves (Waves 1 to 6, respectively: 38.8%,
45.0%, 82.9%, 46.1%, 51.5%, 63.1%) and no statistically significant
variance was explained in the Ratings specific factor (Waves
1 to 6, respectively: 2.2%, 3.8%, 2.4%, 1.9%, 6.1%, 1.8%).